Replace block wait metric with Histogram #424
Merged
The current block wait metric is a gauge. Because a gauge only holds the most recent value, spikes in block wait are lost. I have switched it to a histogram, which captures peaks much better and supports a wider range of useful expressions, some of which I list below. These expressions can also be customized.
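To illustrate why the gauge loses spikes, here is a minimal, self-contained sketch. It does not use the runner's actual metrics code: `SimpleGauge`, `SimpleHistogram`, the bucket bounds, and the sample wait times are all hypothetical and only model the general behaviour of the two metric types.

```typescript
// Hypothetical bucket upper bounds in milliseconds (the PR's real buckets are not shown here).
const BUCKET_BOUNDS_MS: number[] = [10, 50, 100, 500, 1000];

// A gauge keeps only the most recent value it was set to.
class SimpleGauge {
  private value = 0;
  set(v: number): void {
    this.value = v;
  }
  read(): number {
    return this.value;
  }
}

// A histogram keeps a running sum, a count, and cumulative bucket counters.
class SimpleHistogram {
  sum = 0;
  count = 0;
  // Maps each upper bound to the number of observations <= that bound.
  readonly cumulative = new Map<number, number>(
    BUCKET_BOUNDS_MS.map((bound): [number, number] => [bound, 0])
  );

  observe(v: number): void {
    this.sum += v;
    this.count += 1;
    for (const bound of BUCKET_BOUNDS_MS) {
      if (v <= bound) {
        this.cumulative.set(bound, (this.cumulative.get(bound) ?? 0) + 1);
      }
    }
  }
}

const gauge = new SimpleGauge();
const histogram = new SimpleHistogram();

// Made-up block waits recorded between two scrapes; the 900 ms spike is not the last one.
const waitsMs: number[] = [5, 8, 900, 6];
for (const w of waitsMs) {
  gauge.set(w);
  histogram.observe(w);
}

// At scrape time the gauge exposes only the last value, so the spike is invisible.
const gaugeAtScrape = gauge.read();
// The histogram still records that one observation exceeded 500 ms.
const above500 = histogram.count - (histogram.cumulative.get(500) ?? 0);

console.log({ gaugeAtScrape, above500, sum: histogram.sum });
```

In this model the gauge reports 6 ms at scrape time, while the histogram retains both the spike's contribution to the sum and its bucket placement.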
A more accurate average:
rate(queryapi_runner_block_wait_duration_milliseconds_sum[$__rate_interval]) / rate(queryapi_runner_block_wait_duration_milliseconds_count[$__rate_interval])
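As a rough sketch of what this expression computes: dividing the rate of `_sum` by the rate of `_count` over the same window reduces to the increase in the sum divided by the increase in the count, because the window length cancels out. The `HistogramSample` type, the `averageOverWindow` helper, and the sample numbers below are hypothetical. The sketch also ignores the counter-reset handling and extrapolation that `rate()` performs.

```typescript
// Hypothetical snapshot of a histogram's _sum and _count series at one point in time.
interface HistogramSample {
  sum: number;   // total of all observed block waits, in milliseconds
  count: number; // number of observations
}

// Average block wait for observations made between two snapshots.
function averageOverWindow(start: HistogramSample, end: HistogramSample): number {
  const deltaCount = end.count - start.count;
  if (deltaCount === 0) {
    return NaN; // no observations in the window; PromQL also yields NaN for 0 / 0
  }
  return (end.sum - start.sum) / deltaCount;
}

// Made-up snapshots: 4 new observations totalling 919 ms.
const windowStart: HistogramSample = { sum: 1000, count: 100 };
const windowEnd: HistogramSample = { sum: 1919, count: 104 };

const averageMs = averageOverWindow(windowStart, windowEnd);
console.log(averageMs); // 229.75
```

Unlike an average of gauge samples, this figure accounts for every observation in the window, not just the values that happened to be present at scrape time.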
The duration under which 95% of block requests were fulfilled:
histogram_quantile(0.95, sum(rate(queryapi_runner_block_wait_duration_milliseconds_bucket[$__rate_interval])) by (le))
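For intuition, the following is a simplified sketch of how `histogram_quantile` estimates a quantile from cumulative classic-histogram buckets: it finds the bucket containing the target rank and interpolates linearly within it. The `Bucket` type, the `histogramQuantile` function, and the bucket counts are hypothetical, and some edge cases of the real implementation are omitted.

```typescript
// One cumulative bucket: count of observations <= le. Use Infinity for the +Inf bucket.
interface Bucket {
  le: number;
  cumulativeCount: number;
}

// Simplified estimate of the q-quantile (0 <= q <= 1) from cumulative buckets.
function histogramQuantile(q: number, buckets: Bucket[]): number {
  if (q < 0) return -Infinity;
  if (q > 1) return Infinity;

  const sorted = [...buckets].sort((a, b) => a.le - b.le);
  const last = sorted[sorted.length - 1];
  // Needs at least two buckets, with +Inf as the highest, and at least one observation.
  if (sorted.length < 2 || last === undefined || last.le !== Infinity) return NaN;
  if (last.cumulativeCount === 0) return NaN;

  const rank = q * last.cumulativeCount;
  let prevLe = 0; // the lowest bucket is assumed to start at 0
  let prevCount = 0;
  let isFirst = true;

  for (const bucket of sorted) {
    if (bucket.cumulativeCount >= rank) {
      // Rank falls in +Inf: report the highest finite upper bound instead.
      if (bucket.le === Infinity) return prevLe;
      // A lowest bucket with a non-positive bound cannot be interpolated from 0.
      if (isFirst && bucket.le <= 0) return bucket.le;
      const inBucket = bucket.cumulativeCount - prevCount;
      if (inBucket === 0) return prevLe; // guard for this sketch only
      // Linear interpolation inside the bucket.
      return prevLe + (bucket.le - prevLe) * ((rank - prevCount) / inBucket);
    }
    prevLe = bucket.le;
    prevCount = bucket.cumulativeCount;
    isFirst = false;
  }
  return NaN; // unreachable when the +Inf bucket holds the total count
}

// Made-up cumulative counts for illustration.
const exampleBuckets: Bucket[] = [
  { le: 50, cumulativeCount: 10 },
  { le: 100, cumulativeCount: 30 },
  { le: 500, cumulativeCount: 40 },
  { le: Infinity, cumulativeCount: 40 },
];

const median = histogramQuantile(0.5, exampleBuckets); // 75
const p95 = histogramQuantile(0.95, exampleBuckets);   // about 420
console.log({ median, p95 });
```

Because the estimate is interpolated within a bucket, its accuracy depends on how closely the bucket boundaries match the range of block waits that matter.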
The % of requests for each indexer where the block wait was under 100ms